REMARKS 

Claims 1-26 were pending at the time of examination. Claims 1, 12, 15 and 21 have been 
amended. Claims 2-3, 11, 16-17 and 22-23 have been canceled. The Applicant respectfully 
requests reconsideration based on the foregoing amendments and these remarks. 



Claim Rejections - 35 U.S.C. § 103 

Claims 1-26 were rejected under 35 U.S.C. § 103(a) as being unpatentable over U.S. 
Patent Publication No. 2002/0052692 to Fahy (hereinafter "Fahy") in view of U.S. Patent No. 
6,973,459 to Yarmus et al. (hereinafter "Yarmus"). The Applicant respectfully traverses these 
rejections. 

Claim 1, as amended, recites: 

"1. (Currently Amended) A processor-implemented non-iterative method of 
clustering a set of records, each of the records having attribute values for a set of 
attributes, the method comprising: 

for each attribute of the set of attributes, determining a characteristic value 
for said each attribute, the characteristic value being one of a mean value and a 
median value of the attribute values of said attribute across the records; 

for each attribute value, determining a deviation from the characteristic 
value of said each attribute; 

for each record, sorting the set of attributes based on deviations of the 
attribute values from the characteristic value of said each attribute, to provide a 
key; and 

combining the set of records based on the key into a clustering result that 
includes a plurality of clusters; 

wherein the key comprises an ordered list of the set of attributes and the 
deviations from the characteristic value of said each attribute; and 
refining the clustering result by: 

identifying a cluster having a smallest number of records; 

for each record of the identified cluster, searching another cluster 
having records with best matching keys; and 

distributing the cluster with the smallest number of records to the 
other cluster having records with best matching keys, to reduce the total 
number of clusters." 

The preamble of claim 1 has been amended to more clearly distinguish the claimed 
method from the methods of Fahy, by specifying that the method is a non-iterative method. As 
is well-known to those of ordinary skill in the art, the K-means algorithm used in Fahy is an 
iterative algorithm. That is, the K-means algorithm generally performs the following steps: 

1. Place K points into the space represented by the objects that are being 
   clustered. These points represent K initial group centroids. 

2. Assign each object to the group that has the closest centroid. 

3. When all objects have been assigned, recalculate the positions of the K 
   centroids. 

4. Repeat steps 2 and 3 until the centroids no longer move. This produces a 
   separation of the objects into groups from which the metric to be minimized 
   can be calculated. 
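
By way of illustration only, the four steps listed above can be sketched as follows. This is a generic textbook rendering of K-means and is not taken from Fahy; the function name `k_means`, the choice of K input points as initial centroids, the Euclidean distance, and the `max_iter` safeguard are all assumptions made for the sketch. The returned iteration count makes the repeated passes over the data visible.

```python
import math
import random


def k_means(points, k, max_iter=100, seed=0):
    """Generic K-means sketch; returns (centroids, labels, iterations used)."""
    rng = random.Random(seed)
    # Step 1: place K initial centroids (assumption: K distinct input points).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for iteration in range(1, max_iter + 1):
        # Step 2: assign each object to the group with the closest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 3: recalculate the position of each centroid.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # Assumption: an empty group keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Step 4: repeat steps 2 and 3 until the centroids no longer move.
        if new_centroids == centroids:
            return centroids, labels, iteration
        centroids = new_centroids
    return centroids, labels, max_iter
```

In this sketch the distance computed in step 2 is always taken relative to centroids that were moved in the previous pass, which is why at least one further pass is needed to confirm that nothing has changed.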



In contrast, the method of claim 1 does not require multiple iterations. This is explained, 
for example, in paragraph [0015] of the Applicant's specification, which states "Furthermore, 
performance of the clustering method requires only two passes over the data." 

This non-iterative behavior of the method is further clarified in the first step of claim 1, 
where a characteristic value is determined for each attribute. In no subsequent step of claim 1 (or 
its dependent claims) is the characteristic value re-evaluated or re-determined. The characteristic 
value for an attribute is a value that in some fashion represents all values that appear for the 
attribute in the data record set. Expressed differently, the characteristic value can be thought of 
as a "lowest common denominator" against which all the other attribute values are compared in 
order to determine whether a particular attribute value "sticks out" from the group as a whole. In 
order to further clarify and define the characteristic value in claim 1, the claim has been limited 
to characteristic values that are mean values or median values for the attribute values across the 
records. Again, this is different from the K-means algorithm of Fahy, which "tests a number of 
different groupings of the test subject rows into nonhierarchical clusters to search for a set of 
clusters that maximizes the similarity of all the test sample rows assigned to the same cluster" 
and "at the same time. . .maximizes the statistical distance or differences between individual 
clusters" (Fahy, paragraph [0047]). Furthermore, Fahy allows for a mixture of functions (e.g., 
mean for the first attribute, median for the second attribute, and so on), while in the present 
invention a single function (i.e., either mean or median) is used consistently across all attributes. 

Next, claim 1 requires "for each attribute value, determining a deviation from the 
characteristic value of said each attribute." This is also different from the K-means algorithm. 
Whereas similar types of distance or deviation measures may be used in the two methods, the 
points from which the distance is determined are different. In the K-means algorithm, the 
distance is iteratively determined from a centroid, which is, in a sense, a "moving target" as the 
position of the centroid changes in every iteration. In the Applicant's method, however, the 
deviation is determined with respect to the characteristic value, which does not change for a 
given attribute and a given set of records. 
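
For illustration only, and without limiting the claims, the fixed reference point described above might be pictured as in the following sketch. The function names, the representation of a record as a dictionary of numeric attribute values, and the use of a signed difference as the deviation are assumptions of the sketch and do not appear in the specification. The characteristic values are computed once in a first pass, and the deviations are computed in a second pass against those unchanged values.

```python
from statistics import mean, median


def characteristic_values(records, attributes, use_median=False):
    """First pass: one characteristic value per attribute, computed once."""
    func = median if use_median else mean
    return {a: func([r[a] for r in records]) for a in attributes}


def deviations(records, attributes, char_values):
    """Second pass: signed deviation of every attribute value.

    char_values is only read here; it is never re-evaluated.
    """
    return [
        {a: r[a] - char_values[a] for a in attributes}
        for r in records
    ]


# Hypothetical two-record data set with attributes "x" and "y".
records = [{"x": 1, "y": 10}, {"x": 3, "y": 30}]
cv = characteristic_values(records, ["x", "y"])
devs = deviations(records, ["x", "y"], cv)
```

Unlike a centroid that is recalculated in every pass, `cv` in this sketch is the same object before and after the deviations are computed.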

Claim 1 further requires "for each record, sorting the set of attributes based on deviations 
of the attribute values from the characteristic value of said each attribute, to provide a key;" and 
"combining the set of records based on the key into a clustering result that includes a plurality of 
clusters;" That is, first a sorting is performed based on the deviations from the characteristic 
value to obtain a key. The key is subsequently used to combine the records into a plurality of 
clusters. It should be noted that the language has been changed from "clustering the set of 
records. . ." to "combining the set of records. . ." in order to further clarify the invention. The new 
wording is intended to avoid potential confusion, as claim 1 as a whole describes a clustering 
method and the word "clustering" might otherwise imply some kind of recursion in this 
non-iterative method of claim 1. Furthermore, the key is defined in claim 1 as comprising "an 
ordered list of the set of attributes and the deviations from the characteristic value." 
Respectfully, these steps are not shown in Fahy for the following reasons. 

There are two types of clustering in Fahy; a non-hierarchical clustering, and a 
hierarchical clustering. The non-hierarchical clustering is done using the K-means algorithm, 
and does not sort the set of attributes "based on deviations of the attribute values from the 
characteristic value of said each attribute, to provide a key," or cluster the records based on the 
key, as required by claim 1. The K-means algorithm certainly does not specify a key that 
comprises "an ordered list of the set of attributes and the deviations from the characteristic 
value." 
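
Purely as an illustration of the kind of key-based combining that is absent from the K-means algorithm, the following sketch builds a key for each record and groups records that share a key. It is not the Applicant's implementation and is not intended to construe the claim. The names `make_key` and `combine_by_key`, the ordering of attributes by descending absolute deviation, the alphabetical tie-break, and the reduction of each deviation to its sign are all assumptions made so that the example is small and runnable.

```python
from collections import defaultdict


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def make_key(dev):
    """Build a key from one record's deviations.

    dev maps attribute name -> signed deviation from the characteristic value.
    Assumption: attributes are ordered by descending absolute deviation and
    each deviation is represented in the key by its sign only.
    """
    ordered = sorted(dev, key=lambda a: (-abs(dev[a]), a))
    return tuple((a, sign(dev[a])) for a in ordered)


def combine_by_key(devs):
    """Group record indices by key in a single pass; no step is repeated."""
    clusters = defaultdict(list)
    for idx, dev in enumerate(devs):
        clusters[make_key(dev)].append(idx)
    return dict(clusters)


# Hypothetical deviations for three records with attributes "x" and "y".
devs = [{"x": -1, "y": -10}, {"x": -2, "y": -5}, {"x": 3, "y": 1}]
clusters = combine_by_key(devs)
```

In this sketch the first two records receive the same key and therefore fall into the same cluster, while the third record forms a cluster of its own. No centroid and no inter-cluster distance is computed at any point.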

The hierarchical clustering determines "distances between clusters of cluster pairs to 
amalgamate clusters" and "These differences reflect the dissimilarities between the clusters of a 
cluster pair" (Fahy, paragraph [0055]). That is, the hierarchical clustering cannot be performed 
without already having defined clusters (in the case of Fahy, non-hierarchical clusters). In any 
event, determining distances between clusters is different from ". . .sorting the set of attributes 
based on deviations of the attribute values from the characteristic value of said each attribute, to 
provide a key;" and "combining the set of records based on the key into a clustering result that 
includes a plurality of clusters," as is required by claim 1. 

Furthermore, the Examiner has still not explained what part of Fahy she considers to be 
equivalent to the key recited in claim 1, as has been kindly requested by the Applicant in 
previous Office Action responses. Again, the Applicant would greatly appreciate if the 
Examiner could clarify her reasoning on this issue. 

The last limitation of claim 1 has been further defined by incorporating limitations 
similar to the limitations of claim 11 (now canceled), to recite 

"refining the clustering result by: 

identifying a cluster having a smallest number of records; 

for each record of the identified cluster, searching another cluster having 
records with best matching keys; and 

distributing the cluster with the smallest number of records to the other 
cluster having records with best matching keys, to reduce the total number of 
clusters." 

These features are not shown in Fahy, as previously admitted by the Examiner. Instead 
the Examiner relies on Yarmus for this showing. In particular, the Examiner cites col. 16, line 
34 - col. 17, line 6, and col. 14, lines 35-41 of Yarmus to render obvious the above claim 
limitation. The Applicant respectfully disagrees. The first section describes how to use a 
minimum description length (MDL) to rank predictors in an Adaptive Bayes Network (ABN) 
model, by calculating a description length of each predictor and ranking the predictors by the 
description length from smallest to largest. The same section further describes how to use the 
MDL to construct conditionally independent features, by attempting to extend single predictor, 
seed features in rank order. If no candidate extension is smaller than the current baseline, then 
the extension process terminates and a new seed is considered. Otherwise the smallest 
description length candidate becomes the new feature baseline. The second cited section reads: 

"MDL has been used to construct conditionally independent components 
which, individually correlate with the target. The details are described in the 
section Using MDL below. This section answer the following questions. How 
does the conglomerate Bayes Model perform? Do additional components increase 
or decrease predictive accuracy? The task of the pruning phase is to select the 
best subset model." 

While Yarmus addresses the idea of extending single predictor seed features in an ABN 
model and comparing the extensions to find a smallest one, Yarmus certainly does not 
render obvious the steps of finding a smallest cluster, searching another cluster having 
records with best matching keys, and distributing one cluster to another, in the specific 
manner that is described in claim 1. 
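
For illustration only, the refining step recited in claim 1 might be pictured along the following lines. The sketch is not the Applicant's implementation and does not construe the claim. The names `match_score` and `refine_once`, the representation of the clustering result as a mapping from key to record identifiers, and the measure of "best matching" as the number of key positions that agree are assumptions introduced for the example.

```python
def match_score(key_a, key_b):
    """Assumed similarity: number of positions at which two keys agree."""
    return sum(1 for a, b in zip(key_a, key_b) if a == b)


def refine_once(clusters):
    """Dissolve the smallest cluster into the best matching other clusters.

    clusters maps key -> list of record ids. Returns a new mapping that has
    one cluster fewer, or an unchanged copy if fewer than two clusters exist.
    """
    if len(clusters) < 2:
        return {k: list(v) for k, v in clusters.items()}
    # Identify the cluster having the smallest number of records.
    smallest = min(clusters, key=lambda k: len(clusters[k]))
    result = {k: list(v) for k, v in clusters.items() if k != smallest}
    # For each record of that cluster, search for the best matching cluster
    # and move the record there.
    for record_id in clusters[smallest]:
        target = max(result, key=lambda k: match_score(smallest, k))
        result[target].append(record_id)
    return result


# Hypothetical clustering result with three clusters.
key_a = (("x", 1), ("y", 1))
key_b = (("y", 1), ("x", 1))
key_c = (("x", 1), ("y", -1))
refined = refine_once({key_a: [0, 1, 2], key_b: [3, 4], key_c: [5]})
```

In this example the single-record cluster agrees with the first cluster's key in one position and with the second cluster's key in none, so its record is moved to the first cluster and the total number of clusters drops from three to two. Nothing in this operation involves ranking predictors by description length.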

For at least these reasons, it is respectfully submitted that neither Fahy nor 
Yarmus, alone or in combination, anticipates or renders obvious claim 1 under 35 U.S.C. 
§ 103(a). Even if one were to accept the Examiner's general motivation that a person of 
ordinary skill in the art would be motivated to combine Fahy and Yarmus, since "Fahy 
and Yarmus are both of the same endeavor to changing (or reducing) the clustering size 
of a set of records based on the K-mean clustering (or binning) processing" (Office 
Action, page 4), the Examiner has failed to show a reasonable expectation of success 
when combining the two alternative clustering and binning methods of Fahy and Yarmus, 
respectively, for reducing the size of a dataset. Finally, the combination of the references 
must teach or suggest all the claim limitations. It should be clear from the above 
discussion that the references, alone or in combination, fail to teach several of the 
limitations recited in claim 1, as amended. Thus, it is respectfully submitted that the 
Examiner has failed to show a prima facie case of obviousness against claim 1, as 
amended, based on the cited references. For at least these reasons, it is respectfully 
requested that the rejection of claim 1 be withdrawn. 

For reasons substantially similar to those set forth above, the Applicant respectfully 
contends that the rejection of claims 15 and 21 is unsupported by the cited art and should be 
withdrawn. 

Dependent claims 2-3, 11, 16-17 and 22-23 have been canceled in view of the 
amendments made to claims 1, 15 and 21, respectively. Dependent claims 4-10, 12-14, 18-20, 
and 24-26 depend from claims 1, 15 and 21, respectively. They also specify further limitations 
distinguishing the Applicant's invention from the Fahy/Yarmus combination. Thus, for at least 
the reasons discussed above with respect to claims 1, 15, and 21, the rejection of these dependent 
claims is unsupported by the cited art of record. 

Conclusion 

The Applicant believes that all pending claims are allowable and respectfully requests a 
Notice of Allowance for this application from the Examiner. Should the Examiner believe that a 
telephone conference would expedite the prosecution of this application, the undersigned can be 
reached at the telephone number set out below. 

Respectfully submitted, 
MOLLBORN PATENTS 

/Fredrik Mollborn/ 

Fredrik Mollborn, Reg. No. 48,587 

2840 Colby Drive 
Boulder, CO 80305 
(303) 459-4527 